June 26, 2025
Thursday
All t-tests assume approximate normality of the data.
In the case of one-sample t-tests, the measure of interest must approximately follow a normal distribution.
In the case of two-sample t-tests, the measure of interest in each group must approximately follow a normal distribution.
Note that a paired t-test is technically a one-sample t-test, so we will examine normality of the difference.
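Since a paired t-test is just a one-sample t-test on the differences, this is easy to verify in base R (the data below are simulated purely for illustration):

```r
# Paired t-test vs. one-sample t-test on the differences (simulated data)
set.seed(4173)
x <- rnorm(10, mean = 5)
y <- x + rnorm(10)

paired  <- t.test(x, y, paired = TRUE)   # paired two-sample form
one_smp <- t.test(x - y, mu = 0)         # one-sample form on the differences

# Both give identical test statistics, df, and p-values
c(paired$p.value, one_smp$p.value)
```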
There are formal tests for normality (see article here); however, we will not use them.
Instead, we will assess normality using a quantile-quantile (Q-Q) plot.
A Q-Q plot helps us visually check if our data follows a specific distribution (here, the normal).
How do we read Q-Q plots? If the points fall roughly along the reference line, the data are consistent with a normal distribution; systematic curvature or points peeling away in the tails suggests a violation.
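A minimal base-R sketch of building a Q-Q plot (the course uses ssstats helpers; `qqnorm()`/`qqline()` are the built-in equivalents, and the data here are simulated):

```r
set.seed(4173)
x <- rnorm(30, mean = 5, sd = 2)   # simulated measure of interest

qqnorm(x)   # sample quantiles plotted against theoretical normal quantiles
qqline(x)   # reference line: points hugging the line suggest normality
```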
wing_flap %>% independent_mean_HT(grouping = target,
                                  continuous = apples,
                                  mu = 5,
                                  alternative = "greater",
                                  alpha = 0.05)

Two-sample t-test for two independent means and equal variance:
Null: H₀: μ₁ − μ₂ = 5
Alternative: H₁: μ₁ − μ₂ > 5
Test statistic: t(23) = 5.445
p-value: p < 0.001
Conclusion: Reject the null hypothesis (p < 0.001 < α = 0.05)
We can use the independent_qq() function from library(ssstats) to assess normality.
Let’s now look at the normality assumption for our example.
How should we change the code for our dataset?
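In base R, the same group-wise normality check can be sketched as follows (the data frame, column names, and values here are hypothetical stand-ins for the wing_flap example; ssstats::independent_qq() automates this):

```r
set.seed(4173)
wing_flap <- data.frame(
  target = rep(c("A", "B"), each = 15),
  apples = c(rnorm(15, mean = 8), rnorm(15, mean = 3))
)

op <- par(mfrow = c(1, 2))           # one Q-Q panel per group
for (g in unique(wing_flap$target)) {
  y <- wing_flap$apples[wing_flap$target == g]
  qqnorm(y, main = paste("Group", g))
  qqline(y)
}
par(op)
```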
In addition to normality, the two-sample t-test assumes equal variance between groups.
We can check this assumption and easily adjust if the assumption is broken.
Graphical method: scatterplot of residuals
Formal method: test for equal variances (Brown-Forsythe-Levene)
We can use the plot_residuals() function from library(ssstats) to graphically assess the assumption of equal variance.
Let’s now look at the equal-variance assumption for our example.
How should we change the code for our dataset?
Our updated code:
If we believe the assumption may be violated, we can test for equal variance using the Brown-Forsythe-Levene (BFL) test.
This test is valid for more than two groups (read: we will see it again!)
Hypotheses: H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 vs. H_1: at least one \sigma_i^2 differs.
Test statistic, where z_{ij} = |y_{ij} - \tilde{y}_{i.}| is the absolute deviation of observation j in group i from its group median \tilde{y}_{i.}, k is the number of groups, and N = \sum_{i=1}^k n_i (so \text{df}_{\text{num}} = k-1 and \text{df}_{\text{den}} = N-k):
F_0 = \frac{\sum_{i=1}^k n_i(\bar{z}_{i.}-\bar{z}_{..})^2/(k-1)}{\sum_{i=1}^k \sum_{j=1}^{n_i} (z_{ij}-\bar{z}_{i.})^2/(N-k)} \sim F_{\text{df}_{\text{num}}, \text{df}_{\text{den}}}
Note that the BFL is a one-tailed test, which is different from testing means using the t distribution.
p-value:
p = P\left[F_{\text{df}_{\text{num}}, \text{df}_{\text{den}}} \ge F_0\right]
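The BFL statistic is just a one-way ANOVA F computed on the absolute deviations from each group's median, so it can be sketched in base R without extra packages (the data below are simulated, with group B deliberately more variable):

```r
set.seed(4173)
y     <- c(rnorm(15, sd = 1), rnorm(15, sd = 3))   # group B more variable
group <- factor(rep(c("A", "B"), each = 15))

z   <- abs(y - ave(y, group, FUN = median))  # z_ij = |y_ij - group median|
fit <- anova(lm(z ~ group))                  # ANOVA on the z's is the BFL test

F0 <- fit[["F value"]][1]
p  <- fit[["Pr(>F)"]][1]                     # one-tailed: P(F >= F0)
```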
If the equal-variance assumption is violated, we instead use the unpooled (Welch) t-test, whose degrees of freedom come from the Satterthwaite approximation:
\text{df}=\frac{ \left( \frac{s^2_1}{n_1} + \frac{s_2^2}{n_2} \right)^2 }{ \frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1} }
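This degrees-of-freedom formula is exactly what base R's t.test() uses when var.equal = FALSE, which we can check by hand on simulated data:

```r
set.seed(4173)
x1 <- rnorm(12, sd = 1)
x2 <- rnorm(20, sd = 3)

s1 <- var(x1); n1 <- length(x1)
s2 <- var(x2); n2 <- length(x2)

# Satterthwaite df, computed directly from the formula
df_hand <- (s1/n1 + s2/n2)^2 /
           ((s1/n1)^2 / (n1 - 1) + (s2/n2)^2 / (n2 - 1))

welch <- t.test(x1, x2, var.equal = FALSE)   # Welch (unpooled) t-test
all.equal(unname(welch$parameter), df_hand)  # TRUE
```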
The t-tests we have already learned are considered parametric methods.
Nonparametric methods do not make distributional assumptions.
Why don’t we always use nonparametric methods?
They are often less efficient: a larger sample size is required to achieve the same power as the corresponding parametric test.
They discard useful information :(
In the nonparametric tests we will be learning, the data will be ranked.
Let us first consider a simple example, x: 1, 7, 10, 2, 6, 8
Our first step is to reorder the data: x: 1, 2, 6, 7, 8, 10
Then, we replace the values with their ranks: R: 1, 2, 3, 4, 5, 6
What if all data values are not unique? We will assign the average rank for that group.
For example, x: 9, 8, 8, 0, 3, 4, 4, 8
Let’s reorder: x: 0, 3, 4, 4, 8, 8, 8, 9
Rank ignoring ties: R: 1, 2, 3, 4, 5, 6, 7, 8
Now, the final rank: R: 1, 2, 3.5, 3.5, 6, 6, 6, 8
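Base R's rank() reproduces both examples; it averages tied ranks by default (ties.method = "average"), and note that it returns ranks in the original data order rather than sorted order:

```r
rank(c(1, 7, 10, 2, 6, 8))
#> [1] 1 4 6 2 3 5

rank(c(9, 8, 8, 0, 3, 4, 4, 8))   # ties get the average of their ranks
#> [1] 8.0 6.0 6.0 1.0 2.0 3.5 3.5 6.0
```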
Hypotheses
Test Statistic & p-Value
Rejection Region
Conclusion/Interpretation
[Reject or fail to reject] H_0.
There [is or is not] sufficient evidence to suggest [alternative hypothesis in words].
Before ranking, we will find the difference between the paired observations and eliminate any 0 differences.
Note 1: eliminating 0 differences is the big difference from the other tests!
Note 2: because we are eliminating 0 differences, our sample size updates to the number of pairs with a nonzero difference.
When ranking, the differences are ranked based on the absolute value of the difference.
We also keep the sign of the difference.
| X | Y | D | |D| | Rank |
|---|---|---|---|---|
| 5 | 8 | -3 | 3 | -1.5 |
| 8 | 5 | 3 | 3 | +1.5 |
| 4 | 4 | 0 | 0 | (dropped) |
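This is exactly what wilcox.test() with paired = TRUE does: it drops zero differences, ranks the |D| values, and sums the signed ranks. A sketch with hypothetical data; suppressWarnings() is used because R warns when zeros or ties force a normal approximation for the p-value:

```r
x <- c(5, 8, 4, 10, 7, 6)
y <- c(8, 5, 4,  6, 3, 2)

# Differences: -3, 3, 0, 4, 4, 4  ->  the 0 is dropped, so n becomes 5
res <- suppressWarnings(wilcox.test(x, y, paired = TRUE))
res$p.value
```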
Hypotheses
Test Statistic & p-Value
Rejection Region
Conclusion/Interpretation
[Reject or fail to reject] H_0.
There [is or is not] sufficient evidence to suggest [alternative hypothesis in words].
STA4173 - Biostatistics - Summer 2025